<html>
<head>
<link rel="shortcut icon" href="./favicon.ico">
<link rel="stylesheet" type="text/css" href="./style.css">
<link rel="canonical" href="./Divider_Integer_Signed_by_Powers_of_Two_Multiprecision.html">
<meta name="viewport" content="width=device-width, initial-scale=1">
<meta name="description" content="Correctly divides a *signed* integer by a power of two. Uses multiprecision arithmetic to allow working on larger integers at higher speed.">
<title>Divider Integer Signed by Powers of Two Multiprecision</title>
</head>
<body>

<p class="inline bordered"><b><a href="./Divider_Integer_Signed_by_Powers_of_Two_Multiprecision.v">Source</a></b></p>
<p class="inline bordered"><b><a href="./legal.html">License</a></b></p>
<p class="inline bordered"><b><a href="./index.html">Index</a></b></p>

<h1>Signed Integer Divider by Powers of Two (Truncating Division), Multiprecision</h1>
<p>Correctly divides a <em>signed</em> integer by a power of two. Uses multiprecision
 arithmetic to allow working on larger integers at higher speed.</p>
<p>This implementation is rather complex, with multiple parallel
 computation pipelines. Go read the original <a href="./Divider_Integer_Signed_by_Powers_of_Two.html">Divider By Powers Of
 Two</a> implementation to more
 easily understand the (identical) algorithm. </p>
<h2>Theory of Operation</h2>
<p>While a left shift by N is always equivalent to a multiplication by
 2<sup>N</sup> for both signed and unsigned binary integers, an arithmetic
 shift right by N is only a truncating division by 2<sup>N</sup> for
 <em>positive</em> binary integers. For negative integers, the result is a so-called
 modulus division, and the quotient ends up off by one in magnitude, and
 must be corrected by adding +1, <em>but only if an odd number results as part
 of the intermediate division steps</em>.</p>
<p>The implementation is based on the PowerPC method, as described in <a
 href="./reading.html#Warren2013">Hacker's Delight</a>, Section 10-1, <em>"Signed
 Division by a Known Power of Two"</em>: We perform the right shift and take
 note if any 1-bits are shifted out. If so, add one to the shifted value.</p>
<p>Since we divide only by powers of two, a division by zero cannot happen.
 (i.e.: there exists no N where 2<sup>N</sup> = 0) Also, we only allow
 positive exponents. A negative exponent would imply a multiplication, and
 that can be done directly with a left shift and not all this complication.</p>
<p>Note that shifting by more than the WORD_WIDTH, with an exponent of value
 greater than WORD_WIDTH, will give a nonsense result for negative numbers
 as we only have WORD_WIDTH sign bits to shift in at most. </p>
<h2>Latency</h2>
<p>Since the result latency depends on the ratio of <code>WORD_WIDTH</code> to
 <code>STEP_WORD_WIDTH</code> and whether that ratio is a whole integer, the inputs are
 set with a ready/valid handshake, and can be updated after the output
 handshake completes. Adjust <code>PIPE_DEPTH</code> also as needed to meet timing
 across the Bit Shifters.</p>
<h2>Parameters, Ports, and Constants</h2>

<pre>
`default_nettype none

module <a href="./Divider_Integer_Signed_by_Powers_of_Two_Multiprecision.html">Divider_Integer_Signed_by_Powers_of_Two_Multiprecision</a>
#(
    parameter WORD_WIDTH        = 0,
    parameter STEP_WORD_WIDTH   = 0,
    parameter PIPE_DEPTH        = -1
)
(
    input   wire                        clock,
    input   wire                        clock_enable,
    input   wire                        clear,

    input   wire                        input_valid,
    output  wire                        input_ready,

    input  wire signed [WORD_WIDTH-1:0] numerator,
    input  wire        [WORD_WIDTH-1:0] exponent_of_two,

    output  wire                        output_valid,
    input   wire                        output_ready,

    output wire signed [WORD_WIDTH-1:0] quotient,
    output wire signed [WORD_WIDTH-1:0] remainder
);
</pre>

<p>We depend on automatic width extension of the WORD_WIDTH integer here, as
 doing it using a loop in an initial block is worse for linting and CAD
 warnings.  I normally don't allow automatic width extension, but in this
 case it will always work, as the register will always store up to
 2<sup>N</sup>-1 for a WORD_WIDTH of N, and N is always unsigned.  This
 width extension is necessary to match port widths later on. We do have to
 silence the linter, though.</p>

<pre>
    // verilator lint_off WIDTH
    reg [WORD_WIDTH-1:0] WORD_WIDTH_LONG = WORD_WIDTH;
    // verilator lint_on  WIDTH

    localparam NEGATIVE  = 1'b1;
    localparam WORD_ZERO = {WORD_WIDTH{1'b0}};
</pre>

<h2>Inputs</h2>
<p>Fork the inputs to separate calculation pipelines.  We structure the
 calculations as initial calculations followed by corrections, for both
 quotient and remainder each.</p>

<pre>
    wire                    division_fork_valid;
    wire                    division_fork_ready;
    wire [WORD_WIDTH-1:0]   division_numerator;
    wire [WORD_WIDTH-1:0]   division_exponent;

    wire                    quotient_correction_fork_valid;
    wire                    quotient_correction_fork_ready;
    wire [WORD_WIDTH-1:0]   quotient_correction_numerator;
    wire [WORD_WIDTH-1:0]   quotient_correction_exponent;

    wire                    remainder_shift_amount_fork_valid;
    wire                    remainder_shift_amount_fork_ready;
    // verilator lint_off UNUSED
    wire [WORD_WIDTH-1:0]   remainder_shift_amount_numerator;
    // verilator lint_on  UNUSED
    wire [WORD_WIDTH-1:0]   remainder_shift_amount_exponent;

    wire                    remainder_correction_fork_valid;
    wire                    remainder_correction_fork_ready;
    wire [WORD_WIDTH-1:0]   remainder_correction_numerator;
    // verilator lint_off UNUSED
    wire [WORD_WIDTH-1:0]   remainder_correction_exponent;
    // verilator lint_on  UNUSED

    <a href="./Pipeline_Fork_Eager.html">Pipeline_Fork_Eager</a>
    #(
        .WORD_WIDTH     (WORD_WIDTH * 2),
        .OUTPUT_COUNT   (4)
    )
    input_fork
    (
        .clock          (clock),
        .clear          (clear),

        .input_valid    (input_valid),
        .input_ready    (input_ready),
        .input_data     ({numerator, exponent_of_two}),

        .output_valid   ({division_fork_valid,                   quotient_correction_fork_valid,                              remainder_shift_amount_fork_valid,                                 remainder_correction_fork_valid}),
        .output_ready   ({division_fork_ready,                   quotient_correction_fork_ready,                              remainder_shift_amount_fork_ready,                                 remainder_correction_fork_ready}),
        .output_data    ({division_numerator, division_exponent, quotient_correction_numerator, quotient_correction_exponent, remainder_shift_amount_numerator, remainder_shift_amount_exponent, remainder_correction_numerator, remainder_correction_exponent})
    );
</pre>

<h2>Uncorrected Division</h2>
<p>Do the initial, uncorrected division.  The quotient may or may not be off
 by one at this point. It is corrected later.  The remainder is "short"
 because all its significant bits are at the left.  We will shift them, with
 sign extension, back to the right later.</p>
<p>Prepare for a positive or negative numerator, which affects division
 calculations.</p>

<pre>
    reg                  division_numerator_sign = 1'b0;
    reg [WORD_WIDTH-1:0] division_sign_extension = WORD_ZERO;

    always @(*) begin
        division_numerator_sign = division_numerator [WORD_WIDTH-1];
        division_sign_extension = {WORD_WIDTH{division_numerator_sign}};
    end

    wire                    uncorrected_division_valid;
    wire                    uncorrected_division_ready;
    wire [WORD_WIDTH-1:0]   uncorrected_quotient;
    wire [WORD_WIDTH-1:0]   uncorrected_remainder;

    <a href="./Bit_Shifter_Pipelined.html">Bit_Shifter_Pipelined</a>
    #(
        .WORD_WIDTH         (WORD_WIDTH),
        .PIPE_DEPTH         (PIPE_DEPTH)
    )
    uncorrected_division
    (
        .clock              (clock),
        .clear              (clear),

        .input_valid        (division_fork_valid),
        .input_ready        (division_fork_ready),

        .word_in_left       (division_sign_extension),
        .word_in            (division_numerator),
        .word_in_right      (WORD_ZERO),

        .shift_amount       (division_exponent),
        .shift_direction    (1'b1),             // 0/1 -> left/right

        .output_valid       (uncorrected_division_valid),
        .output_ready       (uncorrected_division_ready),

        // verilator lint_off PINCONNECTEMPTY
        .word_out_left      (),
        // verilator lint_on  PINCONNECTEMPTY
        .word_out           (uncorrected_quotient),
        .word_out_right     (uncorrected_remainder)
    );
</pre>

<p>The outputs of the uncorrected division are then forked to both the
 quotient correction and remainder shifting calculations.</p>

<pre>
    wire                    uncorrected_quotient_fork_valid;
    wire                    uncorrected_quotient_fork_ready;
    wire [WORD_WIDTH-1:0]   uncorrected_quotient_quotient;
    wire [WORD_WIDTH-1:0]   uncorrected_remainder_quotient;

    wire                    uncorrected_remainder_fork_valid;
    wire                    uncorrected_remainder_fork_ready;
    // verilator lint_off UNUSED
    wire [WORD_WIDTH-1:0]   uncorrected_quotient_remainder;
    // verilator lint_on  UNUSED
    wire [WORD_WIDTH-1:0]   uncorrected_remainder_remainder;

    <a href="./Pipeline_Fork_Eager.html">Pipeline_Fork_Eager</a>
    #(
        .WORD_WIDTH     (WORD_WIDTH * 2),
        .OUTPUT_COUNT   (2)
    )
    uncorrected_division_fork
    (
        .clock          (clock),
        .clear          (clear),

        .input_valid    (uncorrected_division_valid),
        .input_ready    (uncorrected_division_ready),
        .input_data     ({uncorrected_quotient, uncorrected_remainder}),

        .output_valid   ({uncorrected_quotient_fork_valid,                               uncorrected_remainder_fork_valid}),
        .output_ready   ({uncorrected_quotient_fork_ready,                               uncorrected_remainder_fork_ready}),
        .output_data    ({uncorrected_quotient_quotient, uncorrected_remainder_quotient, uncorrected_quotient_remainder, uncorrected_remainder_remainder})
    );
</pre>

<p>Then re-join the uncorrected division results with some of the original
 input data (numerator), forked for the quotient correction calculation.</p>

<pre>
    wire                    quotient_correction_valid;
    wire                    quotient_correction_ready;
    wire [WORD_WIDTH-1:0]   quotient_correction_quotient;
    wire [WORD_WIDTH-1:0]   quotient_correction_remainder;
    wire [WORD_WIDTH-1:0]   quotient_correction_numerator_joined;
    // verilator lint_off UNUSED
    wire [WORD_WIDTH-1:0]   quotient_correction_exponent_joined;
    // verilator lint_on  UNUSED

    <a href="./Pipeline_Join.html">Pipeline_Join</a>
    #(
        .WORD_WIDTH     (WORD_WIDTH * 2),
        .INPUT_COUNT    (2)
    )
    quotient_correction_join
    (
        .clock          (clock),
        .clear          (clear),

        .input_valid    ({uncorrected_quotient_fork_valid,                               quotient_correction_fork_valid}),
        .input_ready    ({uncorrected_quotient_fork_ready,                               quotient_correction_fork_ready}),
        .input_data     ({uncorrected_quotient_quotient, uncorrected_remainder_quotient, quotient_correction_numerator, quotient_correction_exponent}),

        .output_valid   (quotient_correction_valid),
        .output_ready   (quotient_correction_ready),
        .output_data    ({quotient_correction_quotient, quotient_correction_remainder, quotient_correction_numerator_joined, quotient_correction_exponent_joined})
    );
</pre>

<h2>Quotient Correction</h2>
<p>We need to know if at any point during the shift, a 1-bit was shifted into
 the remainder, indicating an odd-valued intermediate result, and thus an
 off-by-one error in the quotient. A simple OR-reduction works because we
 had primed that part of the shift with zeros.</p>

<pre>
    reg odd_intermediate_result = 1'b0;

    always @(*) begin
        odd_intermediate_result = (quotient_correction_remainder != WORD_ZERO);
    end
</pre>

<p>Now, if the numerator was negative, and there was an odd-valued
 intermediate result, let's add +1 to the uncorrected_quotient to bring it
 back to the result a truncating division would give us.</p>

<pre>
    reg uncorrected_numerator_sign = 1'b0;
    reg correction                 = 1'b0;

    always @(*) begin
        uncorrected_numerator_sign = quotient_correction_numerator_joined [WORD_WIDTH-1];
        correction                 = (uncorrected_numerator_sign == NEGATIVE) && (odd_intermediate_result == 1'b1);
    end

    wire [WORD_WIDTH-1:0] correction_extended;

    <a href="./Width_Adjuster.html">Width_Adjuster</a>
    #(
        .WORD_WIDTH_IN  (1),
        .SIGNED         (0),
        .WORD_WIDTH_OUT (WORD_WIDTH)
    )
    extended_correction
    (
        // It's possible some input bits are truncated away
        // verilator lint_off UNUSED
        .original_input     (correction),
        // verilator lint_on  UNUSED
        .adjusted_output    (correction_extended)
    );

    wire                    quotient_internal_valid;
    wire                    quotient_internal_ready;
    wire [WORD_WIDTH-1:0]   quotient_internal;

    <a href="./Adder_Subtractor_Binary_Multiprecision.html">Adder_Subtractor_Binary_Multiprecision</a>
    #(
        .WORD_WIDTH         (WORD_WIDTH),
        .STEP_WORD_WIDTH    (STEP_WORD_WIDTH)
    )
    quotient_correction
    (
        .clock          (clock),
        .clock_enable   (clock_enable),
        .clear          (clear),

        .input_valid    (quotient_correction_valid),
        .input_ready    (quotient_correction_ready),

        .add_sub        (1'b0), // 0/1 -> A+B/A-B
        .A              (quotient_correction_quotient),
        .B              (correction_extended),

        .output_valid   (quotient_internal_valid),
        .output_ready   (quotient_internal_ready),

        .sum            (quotient_internal),
        // verilator lint_off PINCONNECTEMPTY
        .carry_out      (),
        .carries        (),
        .overflow       ()
        // verilator lint_on  PINCONNECTEMPTY
    );
</pre>

<h2>Shift Remainder</h2>
<p>To shift the short remainder back to the right, we need to shift by the
 remainder of the distance to the right, which is WORD_WIDTH
 - exponent_of_two.</p>

<pre>
    wire                    remainder_shift_amount_valid;
    wire                    remainder_shift_amount_ready;
    wire [WORD_WIDTH-1:0]   remainder_shift_amount;

    <a href="./Adder_Subtractor_Binary_Multiprecision.html">Adder_Subtractor_Binary_Multiprecision</a>
    #(
        .WORD_WIDTH         (WORD_WIDTH),
        .STEP_WORD_WIDTH    (STEP_WORD_WIDTH)
    )
    remainder_alignment
    (
        .clock          (clock),
        .clock_enable   (clock_enable),
        .clear          (clear),

        .input_valid    (remainder_shift_amount_fork_valid),
        .input_ready    (remainder_shift_amount_fork_ready),

        .add_sub        (1'b1), // 0/1 -> A+B/A-B
        .A              (WORD_WIDTH_LONG),
        .B              (remainder_shift_amount_exponent),

        .output_valid   (remainder_shift_amount_valid),
        .output_ready   (remainder_shift_amount_ready),

        .sum            (remainder_shift_amount),
        // verilator lint_off PINCONNECTEMPTY
        .carry_out      (),
        .carries        (),
        .overflow       ()
        // verilator lint_on  PINCONNECTEMPTY
    );
</pre>

<p>Then join the short remainder from the uncorrected division with the
 remainder shift amount and the original numerator in preparation for the
 remainder correction calculation.</p>

<pre>
    wire                    remainder_join_valid;
    wire                    remainder_join_ready;
    wire [WORD_WIDTH-1:0]   remainder_shift_amount_joined;
    wire [WORD_WIDTH-1:0]   short_remainder_remainder_joined;
    wire [WORD_WIDTH-1:0]   remainder_correction_numerator_joined;

    <a href="./Pipeline_Join.html">Pipeline_Join</a>
    #(
        .WORD_WIDTH     (WORD_WIDTH),
        .INPUT_COUNT    (3)
    )
    remainder_join
    (
        .clock          (clock),
        .clear          (clear),

        .input_valid    ({remainder_shift_amount_valid, uncorrected_remainder_fork_valid, remainder_correction_fork_valid}),
        .input_ready    ({remainder_shift_amount_ready, uncorrected_remainder_fork_ready, remainder_correction_fork_ready}),
        .input_data     ({remainder_shift_amount,       uncorrected_remainder_remainder,  remainder_correction_numerator}),

        .output_valid   (remainder_join_valid),
        .output_ready   (remainder_join_ready),
        .output_data    ({remainder_shift_amount_joined, short_remainder_remainder_joined, remainder_correction_numerator_joined})
    );
</pre>

<p>Let's shift the remainder significant bits back to the right, with
 the same sign extension as the quotient if the remainder was not zero.</p>

<pre>
    reg                  remainder_correction_numerator_sign = 1'b0;
    reg [WORD_WIDTH-1:0] remainder_correction_sign_extension = WORD_ZERO;

    always @(*) begin
        remainder_correction_numerator_sign = remainder_correction_numerator_joined [WORD_WIDTH-1];
        remainder_correction_sign_extension = {WORD_WIDTH{remainder_correction_numerator_sign}};
    end

    wire                    remainder_internal_valid;
    wire                    remainder_internal_ready;
    wire [WORD_WIDTH-1:0]   remainder_internal;

    <a href="./Bit_Shifter_Pipelined.html">Bit_Shifter_Pipelined</a>
    #(
        .WORD_WIDTH         (WORD_WIDTH),
        .PIPE_DEPTH         (PIPE_DEPTH)
    )
    remainder_correction
    (
        .clock              (clock),
        .clear              (clear),

        .input_valid        (remainder_join_valid),
        .input_ready        (remainder_join_ready),

        .word_in_left       (remainder_correction_sign_extension),
        .word_in            (short_remainder_remainder_joined),
        .word_in_right      (WORD_ZERO),

        .shift_amount       (remainder_shift_amount_joined),
        .shift_direction    (1'b1), // 0/1 -> left/right

        .output_valid       (remainder_internal_valid),
        .output_ready       (remainder_internal_ready),

        // verilator lint_off PINCONNECTEMPTY
        .word_out_left      (),
        .word_out           (remainder_internal),
        .word_out_right     ()
        // verilator lint_on  PINCONNECTEMPTY
    );
</pre>

<h2>Outputs</h2>
<p>Finally, join the corrected remainder and quotient together into the final
 result.</p>

<pre>
    <a href="./Pipeline_Join.html">Pipeline_Join</a>
    #(
        .WORD_WIDTH     (WORD_WIDTH),
        .INPUT_COUNT    (2)
    )
    output_join
    (
        .clock          (clock),
        .clear          (clear),

        .input_valid    ({remainder_internal_valid, quotient_internal_valid}),
        .input_ready    ({remainder_internal_ready, quotient_internal_ready}),
        .input_data     ({remainder_internal,       quotient_internal}),

        .output_valid   (output_valid),
        .output_ready   (output_ready),
        .output_data    ({remainder, quotient})
    );

endmodule
</pre>

<hr>
<p><a href="./index.html">Back to FPGA Design Elements</a>
<center><a href="https://fpgacpu.ca/">fpgacpu.ca</a></center>
</body>
</html>

